fix(llmobs): propagate distributed headers via signal dispatching, not config #12089

Yun-Kim · 2025-01-24T23:26:01Z

This PR makes a change to our shared distributed tracing header injection method to dispatch signals/events instead of relying on the global config settings, which is only modifiable via env vars. This fixes distributed tracing for users that might rely solely on the LLMObs.enable() setup config.

Programmatic LLMObs.enable()/disable() calls do not set the global config._llmobs_enabled boolean setting, which is only controlled by the DD_LLMOBS_ENABLED env var. This was problematic for users that relied on manual LLMObs.enable() setup (i.e. no env vars) because our distributed tracing injection code only checks the global config to inject llmobs parent IDs into request headers. If users manually enabled LLMObs without any env vars, then this would not be reflected in the global config value and thus LLMObs parent IDs would never be injected into the request headers.

We can't check directly if LLMObs is enabled in the http injection module because:

This would require us to import significant product-specific LLMObs-code into the shared http injector helper module which would impact non-LLMObs users' app performance
Circular imports in LLMObs which imports http injector logic to use in its own helpers

Instead of doing our check based on the global config._llmobs_enabled setting, we now send a tracing event to our shared product listeners, and register a corresponding LLMObs._inject_llmobs_context() hook to be called for all inject() calls if LLMObs is enabled (we check the LLMObs instance, not the global config setting value).

One risk and why I don't like changing global config settings is because this then implies that it is no longer global or tied to an env var (I want to push for env var configuration where possible over manual overriding/enabling). If a global enabled config can be toggled indiscriminately then this could open a can of worms for enabling/disabling logic in our LLMObs service, which isn't really designed to be toggled on/off multiple times in the app's lifespan. However if some users cannot rely on env vars, then I don't see any other solution that does not couple tracer internal code with LLMObs code which is a no-option. (UPDATE: we avoided this issue by using signal dispatching)

Checklist

PR author has checked that all the criteria below are met
The PR description includes an overview of the change
The PR description articulates the motivation for the change
The change includes tests OR the PR description describes a testing strategy
The PR description notes risks associated with the change, if any
Newly-added code is easy to change
The change follows the library release note guidelines
The change includes or references documentation updates if necessary
Backport labels are set (if applicable)

Reviewer Checklist

Reviewer has checked that all the criteria below are met
Title is accurate
All changes are related to the pull request's stated goal
Avoids breaking API changes
Testing strategy adequately addresses listed risks
Newly-added code is easy to change
Release note makes sense to a user of the library
If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
Backport labels are set in a manner that is consistent with the release branch maintenance policy

github-actions · 2025-01-24T23:26:29Z

CODEOWNERS have been resolved as:

releasenotes/notes/fix-llmobs-enable-updates-config-45379a7a30e2e0e3.yaml  @DataDog/apm-python
ddtrace/llmobs/_llmobs.py                                               @DataDog/ml-observability
ddtrace/propagation/http.py                                             @DataDog/apm-sdk-api-python
tests/llmobs/test_llmobs_service.py                                     @DataDog/ml-observability
tests/llmobs/test_propagation.py                                        @DataDog/ml-observability
tests/tracer/test_propagation.py                                        @DataDog/apm-sdk-api-python

ddtrace/llmobs/_llmobs.py

tests/llmobs/test_llmobs_service.py

pr-commenter · 2025-01-25T00:22:46Z

Benchmarks

Benchmark execution time: 2025-01-28 17:30:28

Comparing candidate commit aba113b in PR branch yunkim/llmobs-fix-enable with baseline commit 4f0bcb5 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 394 metrics, 2 unstable metrics.

sabrenner

LGTM, thanks for fixing! re:

We can't check directly if LLMObs is enabled in the http injection module

I'm not sure about the performance implications of core.dispatch, but I wonder if that could be of use, i.e., adding:

core.dispatch("tracer.span.inject", (headers, ))

from the HTTPPropagator.inject method and then adding our parent id to headers['x-datadog-tags']. not relevant to the fix here per se, but might make it easier for these injections to happen automatically thru auto-instrumented requests libraries instead of a user needing to call LLMObs.inject_distributed_headers directly. not sure of the validity here but maybe something we can follow up on 😄

datadog-dd-trace-py-rkomorn · 2025-01-27T20:48:24Z

Datadog Report

Branch report: yunkim/llmobs-fix-enable
Commit report: aba113b
Test service: dd-trace-py

✅ 0 Failed, 130 Passed, 1378 Skipped, 4m 54.85s Total duration (35m 5.82s time saved)

sabrenner

new implementation lgtm!

github-actions · 2025-01-28T22:12:01Z

The backport to 2.19 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.19 2.19
# Navigate to the new working tree
cd .worktrees/backport-2.19
# Create a new branch
git switch --create backport-12089-to-2.19
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 c4448ea4e5580cfa55684ef108a5dafafe1f6f75
# Push it to GitHub
git push --set-upstream origin backport-12089-to-2.19
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.19

Then, create a pull request where the base branch is 2.19 and the compare/head branch is backport-12089-to-2.19.

github-actions · 2025-01-28T22:12:01Z

The backport to 2.20 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.20 2.20
# Navigate to the new working tree
cd .worktrees/backport-2.20
# Create a new branch
git switch --create backport-12089-to-2.20
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 c4448ea4e5580cfa55684ef108a5dafafe1f6f75
# Push it to GitHub
git push --set-upstream origin backport-12089-to-2.20
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.20

Then, create a pull request where the base branch is 2.20 and the compare/head branch is backport-12089-to-2.20.

…t config (#12089) This PR makes a change to our shared distributed tracing header injection method to dispatch signals/events instead of relying on the global config settings, which is only modifiable via env vars. This fixes distributed tracing for users that might rely solely on the `LLMObs.enable()` setup config. Programmatic `LLMObs.enable()/disable()` calls do not set the global `config._llmobs_enabled` boolean setting, which is only controlled by the `DD_LLMOBS_ENABLED` env var. This was problematic for users that relied on manual `LLMObs.enable()` setup (i.e. no env vars) because our distributed tracing injection code only checks the global config to inject llmobs parent IDs into request headers. If users manually enabled LLMObs without any env vars, then this would not be reflected in the global config value and thus LLMObs parent IDs would never be injected into the request headers. We can't check directly if LLMObs is enabled in the http injection module because: 1. This would require us to import significant product-specific LLMObs-code into the shared http injector helper module which would impact non-LLMObs users' app performance 2. Circular imports in LLMObs which imports http injector logic to use in its own helpers Instead of doing our check based on the global `config._llmobs_enabled` setting, we now send a tracing event to our shared product listeners, and register a corresponding `LLMObs._inject_llmobs_context()` hook to be called for all inject() calls if LLMObs is enabled (we check the LLMObs instance, not the global config setting value). ~One risk and why I don't like changing global config settings is because this then implies that it is no longer global or tied to an env var (I want to push for env var configuration where possible over manual overriding/enabling). If a global enabled config can be toggled indiscriminately then this could open a can of worms for enabling/disabling logic in our LLMObs service, which isn't really designed to be toggled on/off multiple times in the app's lifespan. However if some users cannot rely on env vars, then I don't see any other solution that does not couple tracer internal code with LLMObs code which is a no-option.~ (UPDATE: we avoided this issue by using signal dispatching) - [x] PR author has checked that all the criteria below are met - The PR description includes an overview of the change - The PR description articulates the motivation for the change - The change includes tests OR the PR description describes a testing strategy - The PR description notes risks associated with the change, if any - Newly-added code is easy to change - The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) - The change includes or references documentation updates if necessary - Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)) - [x] Reviewer has checked that all the criteria below are met - Title is accurate - All changes are related to the pull request's stated goal - Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes - Testing strategy adequately addresses listed risks - Newly-added code is easy to change - Release note makes sense to a user of the library - If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment - Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

Yun-Kim requested a review from a team as a code owner January 24, 2025 23:26

Kyle-Verhoog reviewed Jan 24, 2025

View reviewed changes

ddtrace/llmobs/_llmobs.py Outdated Show resolved Hide resolved

Yun-Kim force-pushed the yunkim/llmobs-fix-enable branch from 6797009 to 79f185d Compare January 24, 2025 23:30

Yun-Kim requested a review from a team as a code owner January 24, 2025 23:30

Yun-Kim requested review from duncanista and mabdinur January 24, 2025 23:30

Toggle config when enabled/disabled

883953c

Yun-Kim force-pushed the yunkim/llmobs-fix-enable branch from 79f185d to 883953c Compare January 24, 2025 23:33

Kyle-Verhoog reviewed Jan 24, 2025

View reviewed changes

tests/llmobs/test_llmobs_service.py Outdated Show resolved Hide resolved

Add test case ensuring falsey env var value takes precedence

0d35665

Yun-Kim added backport 2.20 backport 2.19 labels Jan 27, 2025

sabrenner approved these changes Jan 27, 2025

View reviewed changes

Yun-Kim added the manual merge Do not automatically merge label Jan 27, 2025

Kyle-Verhoog approved these changes Jan 27, 2025

View reviewed changes

Yun-Kim requested a review from a team as a code owner January 27, 2025 20:35

Yun-Kim changed the title ~~fix(llmobs): toggle config when enabled/disabled programmatically~~ fix(llmobs): propagate distributed headers via signal dispatching, not config Jan 27, 2025

Use signal dispatchers instead of overriding global config

db253b7

Yun-Kim force-pushed the yunkim/llmobs-fix-enable branch from f8e5157 to db253b7 Compare January 27, 2025 21:52

Yun-Kim and others added 2 commits January 27, 2025 16:52

Merge branch 'main' into yunkim/llmobs-fix-enable

8c259ee

fmt

2aeef6a

sabrenner approved these changes Jan 27, 2025

View reviewed changes

Kyle-Verhoog approved these changes Jan 28, 2025

View reviewed changes

Merge branch 'main' into yunkim/llmobs-fix-enable

aba113b

mabdinur approved these changes Jan 28, 2025

View reviewed changes

Yun-Kim merged commit c4448ea into main Jan 28, 2025
638 checks passed

Yun-Kim deleted the yunkim/llmobs-fix-enable branch January 28, 2025 22:11

Yun-Kim mentioned this pull request Jan 28, 2025

fix(llmobs): propagate distributed headers via signal dispatching [backport #12089] #12134

Open

2 tasks

Yun-Kim removed the backport 2.19 label Jan 28, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(llmobs): propagate distributed headers via signal dispatching, not config #12089

fix(llmobs): propagate distributed headers via signal dispatching, not config #12089

Yun-Kim commented Jan 24, 2025 •

edited

Loading

github-actions bot commented Jan 24, 2025 •

edited

Loading

pr-commenter bot commented Jan 25, 2025 •

edited

Loading

sabrenner left a comment

datadog-dd-trace-py-rkomorn bot commented Jan 27, 2025 •

edited

Loading

sabrenner left a comment

github-actions bot commented Jan 28, 2025

github-actions bot commented Jan 28, 2025

fix(llmobs): propagate distributed headers via signal dispatching, not config #12089

fix(llmobs): propagate distributed headers via signal dispatching, not config #12089

Conversation

Yun-Kim commented Jan 24, 2025 • edited Loading

Checklist

Reviewer Checklist

github-actions bot commented Jan 24, 2025 • edited Loading

pr-commenter bot commented Jan 25, 2025 • edited Loading

Benchmarks

sabrenner left a comment

Choose a reason for hiding this comment

datadog-dd-trace-py-rkomorn bot commented Jan 27, 2025 • edited Loading

Datadog Report

sabrenner left a comment

Choose a reason for hiding this comment

github-actions bot commented Jan 28, 2025

github-actions bot commented Jan 28, 2025

Yun-Kim commented Jan 24, 2025 •

edited

Loading

github-actions bot commented Jan 24, 2025 •

edited

Loading

pr-commenter bot commented Jan 25, 2025 •

edited

Loading

datadog-dd-trace-py-rkomorn bot commented Jan 27, 2025 •

edited

Loading